Global Nash convergence of Foster and Young's regret testing
Abstract
We construct an uncoupled randomized strategy of repeated play such that, if every player plays according to it, mixed action profiles converge almost surely to a Nash equilibrium of the stage game. The strategy requires very little in terms of information about the game, as players’ actions are based only on their own past payoffs. Moreover, in a variant of the procedure, players need not know that there are other players in the game and that payoffs are determined through other players’ actions. The procedure works for finite generic games and is based on appropriate modifications of a simple stochastic learning rule introduced by Foster and Young [12].
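To make the idea of a payoff-based, uncoupled rule concrete, here is a minimal sketch of one period of regret testing from a single player's point of view. It is an illustration under stated assumptions, not the procedure analyzed in the paper: the function names, the parameters `period_len`, `explore_prob`, `tolerance` and `reset_prob`, and the use of a small unconditional reset probability to stand in for the "appropriate modifications" are all hypothetical choices made here. The player only ever calls `payoff_fn` on its own action, so it never observes other players' actions or payoffs.

```python
import random


def random_mixed_strategy(n_actions, rng):
    """Draw a mixed strategy uniformly from the probability simplex
    by normalizing independent exponential variates."""
    weights = [rng.expovariate(1.0) for _ in range(n_actions)]
    total = sum(weights)
    return [w / total for w in weights]


def estimate_regret(payoff_fn, strategy, period_len, explore_prob, rng):
    """Play one period and return the player's estimated regret.

    payoff_fn(action) is a hypothetical callback returning this player's
    realized payoff; it hides everything about the other players.
    """
    n = len(strategy)
    regular = []                          # payoffs from playing `strategy`
    experimental = [[] for _ in range(n)]  # payoffs from trial plays, per action
    for _ in range(period_len):
        if rng.random() < explore_prob:
            # Experimental round: try an action chosen uniformly at random.
            a = rng.randrange(n)
            experimental[a].append(payoff_fn(a))
        else:
            # Regular round: sample from the current mixed strategy.
            a = rng.choices(range(n), weights=strategy)[0]
            regular.append(payoff_fn(a))
    estimates = [sum(p) / len(p) for p in experimental if p]
    if not regular or not estimates:
        return 0.0  # not enough data to test anything this period
    baseline = sum(regular) / len(regular)
    # Regret: how much better the best tried action did than the strategy.
    return max(estimates) - baseline


def update_strategy(strategy, regret, tolerance, reset_prob, rng):
    """Keep the strategy if regret is within tolerance; otherwise redraw it.

    Assumption: `reset_prob` models a small chance of redrawing even when
    the regret test passes, so that play cannot get stuck forever.
    """
    if regret > tolerance or rng.random() < reset_prob:
        return random_mixed_strategy(len(strategy), rng)
    return strategy
```

In a simulation, each player would run `estimate_regret` and `update_strategy` independently over successive periods, with `payoff_fn` wired to the stage game by the environment rather than by the player.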
Similar resources
Regret Testing: A Simple Payoff-Based Procedure for Learning Nash Equilibrium
A learning rule is uncoupled if a player does not condition his strategy on the opponent's payoffs. It is radically uncoupled if a player does not condition his strategy on the opponent's actions or payoffs. We demonstrate a family of simple, radically uncoupled learning rules whose period-by-period behavior comes arbitrarily close to Nash equilibrium behavior in any finite two-person game. Keywor...
Regret testing: learning to play Nash equilibrium without knowing you have an opponent
A learning rule is uncoupled if a player does not condition his strategy on the opponent’s payoffs. It is radically uncoupled if a player does not condition his strategy on the opponent’s actions or payoffs. We demonstrate a family of simple, radically uncoupled learning rules whose period-by-period behavior comes arbitrarily close to Nash equilibrium behavior in any finite two-person game.
Continuous and Global Stability in Innovative Evolutionary Dynamics
Innovation plays a central role in the development of modern economies, as does the regret of those who have missed the opportunity to try a successful new strategy. In contrast to purely biological environments, where new strategies emerge mainly by random mutation, human societies tend to exhibit more deliberate, although possibly imperfect inventions of new strategies. In this paper, we stud...
On the convergence of no-regret learning in selfish routing
We study the repeated, non-atomic routing game, in which selfish players make a sequence of routing decisions. We consider a model in which players use regret-minimizing algorithms as the learning mechanism, and study the resulting dynamics. We are concerned in particular with the convergence to the set of Nash equilibria of the routing game. No-regret learning algorithms are known to guarantee...
Unifying Convergence and No-Regret in Multiagent Learning
We present a new multiagent learning algorithm, RVσ(t), that builds on an earlier version, ReDVaLeR. ReDVaLeR could guarantee (a) convergence to best response against stationary opponents and either (b) constant bounded regret against arbitrary opponents, or (c) convergence to Nash equilibrium policies in self-play. But it makes two strong assumptions: (1) that it can distinguish between self-...
Journal:
- Games and Economic Behavior
Volume 60, Issue
Pages -
Publication date: 2007